Models have many uses

  • Organizing data
  • Developing theoretical principles
  • Exploring hypotheticals
  • Guiding policy
  • Explaining observed patterns

Mechanistic models as scientific instruments

In seeking to understand biological phenomena, models are tools for looking at the data.

They focus attention on the discrepancies between our hypotheses and reality.

How to quantify these discrepancies?

The key principle of statistics is that we must model the error.

Mechanistic models as scientific instruments

  • In seeking to infer causal mechanism in intact biological systems, dynamics are especially informative.
  • Hypotheses about causal mechanisms can often readily be formalized as dynamical systems models.
  • The problems of inference
    • How can we estimate unknown parameters?
    • How can we decide among competing models?
    • How can we extract maximum information from time series data?
  • How do we know when we need a better model vs. better data?

Sources of error

The key principle is that we must model the error.

  • Measurement error
    • Random error
    • Finite precision
  • Process noise
    • Environmental trends and fluctuations
    • Unmodeled heterogeneities
    • Model misspecification
    • Unmodeled variables

Overview

  1. Markov models
  2. Examples: cholera and pertussis
  3. Partially observed Markov processes
  4. Plug and play inference methods
  5. Plug and play methods in practice
  6. Conclusions and recommendations

The dialogue between scientist and Nature

  • We have questions about the causes of things…
  • …but Nature does not answer our questions directly.
  • We must propose hypothetical answers to our questions.
  • Models are nothing other than hypothetical answers to questions we ask.
  • Models are necessarily only partial answers.
  • It is relatively easy to ask difficult questions.
  • It is harder, but still relatively easy, to propose creative hypothetical answers.
  • It is often much more difficult to hear what the data have to say.

The dialogue between scientist and Nature

  • Before the data will give a response, they must be asked in the proper way.
  • Their responses are always equivocal, but some answers are better than others.
  • One difficulty: how do we quantify this?
  • Another difficulty: the answer we get depends on the question we ask.
    How do we avoid misunderstanding the reply?

Quantifying discrepancy between model and data

  • The natural and optimal quantification is in terms of surprise: \(-\log{p}\)
  • A better answer is one in which the data are less surprising.
  • In this view, a model must be generative before Nature will respond.
  • Thus, properly speaking, a model is a probability distribution.
  • We must model the noise and the error.

Sources of noise and error

  • Measurement error
  • Process noise
    • Environmental trends and fluctuations
    • Unmodeled heterogeneities
    • Model misspecification
    • Unmodeled variables

Stochastic models

\[ \begin{aligned} N_{\emptyset{X}}(t)&=\text{cumulative number of births into X by time $t$}\\ N_{XY}(t)&=\text{cumulative number of X $\to$ Y movements by time $t$}\\ N_{Y\emptyset}(t)&=\text{cumulative number of Y $\to$ $\emptyset$ movements by time $t$}\\ \end{aligned} \]

\[ \begin{aligned} dN_{\emptyset{X}}&=B(t)\,dt\\ dN_{XY}&=\lambda(Y)\,X\,dt\\ dN_{Y\emptyset}&=\mu\,Y\,dt\\ \end{aligned} \]

\[ \begin{aligned} dN_{\emptyset{X}}&=B(t)\,dt +\sigma_1(t)\,dW_1(t)\\ dN_{XY}&=\lambda(Y)\,X\,dt +\sigma_2(X,Y)\,dW_2(t)\\ dN_{Y\emptyset}&=\mu\,Y\,dt +\sigma_3(Y)\,dW_3(t)\\ \end{aligned} \]

\[dW_i\;\stackrel{i.i.d}{\sim}\;\mathrm{Normal}(0,\sqrt{dt})\]

\[ \begin{aligned} dX&=dN_{\emptyset{X}}-dN_{XY}\\ &=B(t)\,dt-\lambda(Y)\,X\,dt+\sigma_1(t)\,dW_1(t)-\sigma_2(X,Y)\,dW_2(t)\\ dY&=dN_{XY}-dN_{Y\emptyset}\\ &=\lambda(Y)\,X\,dt-\mu\,Y\,dt+\sigma_2(X,Y)\,dW_2(t)-\sigma_3(Y)\,dW_3(t)\\ \end{aligned} \]
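A system of this form can be simulated directly with an Euler-Maruyama scheme. The sketch below is illustrative only: the birth rate \(B\), the force of infection \(\lambda(Y)=\beta Y/N\), and the noise intensities are placeholder choices, not values from the slides.

```python
import numpy as np

def simulate(T=100.0, dt=0.01, X0=990.0, Y0=10.0, seed=1):
    """Euler-Maruyama simulation of the X (e.g., susceptible) /
    Y (e.g., infected) counting-process system.  All rate functions and
    noise intensities here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    beta, mu, B, N = 0.5, 0.1, 1.0, 1000.0
    sig1, sig2, sig3 = 0.1, 0.05, 0.05
    X, Y = X0, Y0
    for _ in range(int(T / dt)):
        lam = beta * Y / N                        # force of infection
        dW = rng.normal(0.0, np.sqrt(dt), 3)      # Wiener increments
        dN_birth = B * dt + sig1 * dW[0]
        dN_inf   = lam * X * dt + sig2 * np.sqrt(lam * X) * dW[1]
        dN_death = mu * Y * dt + sig3 * np.sqrt(mu * Y) * dW[2]
        X += dN_birth - dN_inf                    # dX = dN_{0X} - dN_{XY}
        Y += dN_inf - dN_death                    # dY = dN_{XY} - dN_{Y0}
        X, Y = max(X, 0.0), max(Y, 0.0)           # keep counts non-negative
    return X, Y
```

Note that plug-and-play methods, discussed below, require only such a simulator, not the transition densities it implies.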

Endemic cholera

What are the roles of seasonal and decadal climate drivers in the epidemiology of cholera?

What is the best vaccination strategy?

Questions

  1. Roles of seasonal and decadal climate drivers
    • complex seasonality
    • multiennial cycles
  2. Importance of bacteriophage in environment
  3. Relative importance of human-human vs environmental transmission
  4. Durations of vaccine- and infection-induced immunity

Transmission models for endemic cholera

A large family of models:

  • How well do each of the models explain the data?
  • What insights can we gain into the system’s dynamic self-regulation?

These questions involve inescapable technical complications:

  • nonlinearity
  • stochasticity
  • nonstationarity
  • time-varying parameters

The ongoing pertussis resurgence


(Lavine, King, and Bjørnstad 2011)

The ongoing pertussis resurgence


(Domenech de Cellès, Magpantay, King, and Rohani 2016)

Question

Why is pertussis resurgent?

Hypothetical answers include:

  • Changes in vaccine efficacy
  • Vaccine-driven pathogen evolution
  • Increased circulation of congeneric pathogens
  • Asymptomatic transmission
  • Loss of natural immune boosting
  • Increased surveillance sensitivity

Key unknowns

  • durability of vaccine-induced immunity
  • relative efficacy of natural- and vaccine-derived immunity

Modes of vaccine failure

\[\lambda(I_1,I_2,t) = \frac{\beta(t)\,(I_1+\theta\,I_2)+\bar{\beta}\,\iota}{N}\]

Post-vaccination infections are observed at a reduced rate, \(\eta\).

(Magpantay, Domenech de Cellès, Rohani, and King 2016)
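The force of infection above is a one-line computation. In this sketch, \(\beta(t)\) is taken to be sinusoidally seasonal; that choice and all parameter values are placeholders for illustration, not estimates from the cited papers.

```python
import math

def force_of_infection(t, I1, I2, N, theta=0.5, beta0=10.0, amp=0.2,
                       beta_bar=5.0, iota=1.0):
    """Force of infection from the slide:
        lambda = (beta(t)*(I1 + theta*I2) + beta_bar*iota) / N
    where I1, I2 are naive and post-vaccination infecteds, theta is the
    relative infectiousness of I2, and beta_bar*iota is imported
    infection.  beta(t) is assumed sinusoidal with period 1 (year);
    all numeric defaults are hypothetical."""
    beta_t = beta0 * (1.0 + amp * math.cos(2.0 * math.pi * t))
    return (beta_t * (I1 + theta * I2) + beta_bar * iota) / N
```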

Modes of vaccine failure

  • Can we estimate the nature and durability of vaccine-induced protection?
  • Which models explain the data adequately?

(Magpantay et al. 2016)

Pertussis resurgence in progress

Pertussis in Massachusetts

(Domenech de Cellès, Magpantay, King, and Rohani 2018)

Modes of vaccine failure

(Domenech de Cellès et al. 2018)

Modes of vaccine failure

  • Can we estimate the nature and durability of vaccine-induced protection?
  • Which models explain the data adequately?
  • Questions of parameter estimation and model selection

(Domenech de Cellès et al. 2018)

Partially observed Markov processes

(King, Nguyen, and Ionides 2016)

The “plug-and-play” property

Definition: an algorithm has the plug-and-play property if it has no need to compute the latent process transition density.

Plug-and-play methods access the latent process model only via simulation.

This puts essentially no restrictions on the form of the models that can be entertained.

They are also called “simulation-based” methods.

(He, Ionides, and King 2010; King et al. 2016)

Simulation-based inference methods

  • Feature-based methods
    • Approximate Bayesian computation (ABC)
    • Nonlinear forecasting (NLF)
    • Attractor-reconstruction-based methods
    • Probe matching
    • Synthetic likelihood
  • Full-information methods (likelihood based)
    • Sequential Monte Carlo (the particle filter)
    • Particle Markov chain Monte Carlo
    • Iterated filtering

Variants of all of these are available in the pomp software package (King et al. 2016; https://kingaa.github.io/pomp/) and elsewhere.

Pertussis and vaccine failure

\[\lambda(I_1,I_2,t) = \frac{\beta(t)\,(I_1+\theta\,I_2)+\bar{\beta}\,\iota}{N}\]

Post-vaccination infections are observed at a reduced rate, \(\eta\).

(Magpantay et al. 2016)

Profile likelihood

Interpretation: in vaccinated hosts, infections are mild to asymptomatic, yet equally infectious

Limits to information

Flat profiles indicate lack of information in the data relative to the question.

In effect, the data refuse to answer the question.

(Magpantay et al. 2016)

Endemic cholera


(King, Ionides, Pascual, and Bouma 2008)

Profile likelihood

(King et al. 2008)

Profile likelihood

(King et al. 2008)

Endemic cholera

Is it necessary that all infected individuals be equally infectious?
(King et al. 2008)

Asymptomatic infections

  • among symptomatic infections, case fatality: \(0.34\pm 0.2\)
  • duration of immunity: \(1.5 \pm 0.7~\text{yr}\)
(King et al. 2008)

Perspective

  • Simulation-based methods make it possible to obtain answers from models that embody precisely the questions we ask.
  • One need not carefully tailor the statistical algorithm to the model.
  • It is perilous to invest too much time in one model: the Pygmalion problem (becoming enamored of one's own creation).

Conclusions

  • To obtain answers to our questions, we must pose them properly: as generative, stochastic models.
  • Therefore, we must take care to model the noise.
  • Simulation-based inference methods facilitate scientific investigation.
  • Effective, maximally efficient inference methods are available.
  • There is an intense need for further methodological development of such methods to improve computational efficiency and accommodate new data types.

Common model comparison criteria

  • The best model agrees with already published results.
  • The best model is the one that I can estimate before my thesis is due.
  • The best model is the one that best agrees with my preconceptions.
  • The best model is the one that makes my supervisor the happiest.
  • An identifiable model is better than a non-identifiable model.
  • The best model is the one with the narrowest confidence intervals.
  • The best model is the one with the lowest AIC (or BIC, or ….).

Valid model comparison criteria

  • How well does the model fit the data?
    • How does the likelihood compare to benchmark values?
    • Does the model fit well? (fitting best \(\ne\) fitting well)
  • How does the model explain the data?
    • What aspects of the data are explained?
    • What aspects are ascribed to noise?
    • What do the parameter estimates predict about the system?
  • How adequate is the model?
    • Are the data consistent with the model’s assumptions?
    • If the model were correct, how plausible are the data?
    • How sensitive are the estimates to model assumptions?

Valid model comparison criteria

  • Predictive power
    • Can the model make accurate out-of-sample predictions?
    • What predictions do the parameter estimates make about the results of independent studies?
  • Fertility
    • Does the model make testable predictions?

Partially observed Markov processes

  • observation times: \(t_1 < t_2 < \dots < t_N\)
  • data: \(y_{t_n}^*\)
  • unobserved Markov state process: \(X_t\)
  • measurement process: \(Y_t\)
  • process model (Markov state transition density): \(f_n(X_{t_n}|X_{t_{n-1}},\theta)\)
  • measurement model: \(g_n(Y_{t_n}|X_{t_{n}},\theta)\)
  • initial state distribution \(f_0(X_{t_0},\theta)\)

(King et al. 2016)

Central problem

The full joint density is:

\[f_{X,Y}(x,y;\theta) = f_0(x_0;\theta)\,\prod_{n=1}^N\!f_{n}(x_n|x_{n-1};\theta)\,g_{n}(y_n|x_n;\theta).\]

The likelihood function is the marginal density for \(Y\), evaluated at the data:

\[ \begin{split} \mathcal{L}(\theta)&=f_{Y}(y^*_1,\dots,y^*_N;\theta)\\ &=\int f_{X,Y}(x_0,\dots,x_N,y^*_1,\dots,y^*_N;\theta)\, dx_0\cdots dx_N. \end{split} \]
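The integral can in principle be evaluated by crude Monte Carlo: simulate latent trajectories from the process model and average the product of measurement densities along each. The sketch below does exactly that; the function signatures (`rprocess`, `dmeasure`, `rinit`) are illustrative conventions, not the pomp API. The variance of this estimator explodes with the length of the series, which is why the particle filter is needed in practice.

```python
import numpy as np

def naive_mc_lik(y_star, rprocess, dmeasure, rinit, J=10000, seed=0):
    """Crude Monte Carlo estimate of the marginal likelihood L(theta).
    rprocess(X, n, rng): one-step simulator of the latent process;
    dmeasure(y, X): measurement density g(y | x), vectorized over particles;
    rinit(J, rng): draws J initial states from f_0."""
    rng = np.random.default_rng(seed)
    X = rinit(J, rng)
    w = np.ones(J)
    for n, y in enumerate(y_star):
        X = rprocess(X, n, rng)
        w *= dmeasure(y, X)        # accumulate prod_n g(y_n | x_n)
    return w.mean()                # unbiased estimate of L(theta)
```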

Synthetic likelihood

Given parameters \(\theta\in\Theta\) and data \(y^*\in\mathcal{Y}\)

  • Select probes (summary statistics), \(\mathbb{S}:\mathcal{Y}\to\mathbb{R}^d\)
  • Compute observed probes, \(s^* = \mathbb{S}(y^*)\)
  • Simulate \(J\) datasets \(y_j\) and compute simulated probes \(s_j=\mathbb{S}(y_j)\)
  • Using sample mean, \(\mu = \langle s_j \rangle\), and variance, \(\Sigma = \langle (s_j-\mu)\,(s_j-\mu)^T \rangle\), compute the synthetic log likelihood \[\ell_{\mathbb{S}}(\theta) = -\tfrac{1}{2}\,(s^*-\mu)^{T}{\Sigma}^{-1}(s^*-\mu)-\tfrac{1}{2}\,\log((2\pi)^d|\Sigma|)\]
  • E.g., maximize \(\ell_{\mathbb{S}}\) over \(\Theta\)

(Fasiolo, Pya, and Wood 2014; Wood 2010)
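The steps above translate almost line for line into code. This sketch assumes the probes have already been computed and stacked into an array; the interface is mine, not from the cited papers.

```python
import numpy as np

def synthetic_loglik(s_star, sims):
    """Synthetic log likelihood in the sense of Wood (2010).
    s_star: length-d vector of observed probes S(y*);
    sims:   (J, d) array of probes S(y_j) from J model simulations."""
    mu = sims.mean(axis=0)                     # sample mean of probes
    Sigma = np.cov(sims, rowvar=False)         # sample covariance of probes
    d = len(s_star)
    resid = s_star - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return (-0.5 * resid @ np.linalg.solve(Sigma, resid)
            - 0.5 * (d * np.log(2.0 * np.pi) + logdet))
```

Maximizing this quantity over \(\Theta\) (e.g., with a derivative-free optimizer, since the surface is noisy) completes the procedure.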

The particle filter

  • Given \(\theta\) and \(y^*\), initialize particle swarm by drawing \(X^F_{t_0,j}\sim f_0(\cdot|\theta)\), \(j=1,\dots,J\)
  • For \(n = 1,\dots,N\),
    • Simulate for prediction: \(X^P_{t_{n},j}\sim f_n(\cdot|X^F_{t_{n-1},j},\theta)\)
    • Evaluate weights: \(w_{n,j} = g(y^*_{t_{n}}|X^P_{t_{n},j},\theta)\)
    • Perform weighted resample of \(\{X^P_{t_n,j}\}_{j=1}^J\) to obtain \(\{X^F_{t_n,j}\}_{j=1}^J\)
    • Estimate conditional log likelihood: \(\ell_{n}(\theta) = \log\mathrm{Prob}[y^*_{t_n}|y^*_{t_1},\dots,y^*_{t_{n-1}}] = \log\tfrac{1}{J}\sum_j\!w_{n,j}\)
  • Estimate of the log likelihood: \(\ell(\theta) = \sum_n\,\ell_n(\theta)\) (the corresponding likelihood estimate \(e^{\ell(\theta)}\) is unbiased)

(Arulampalam, Maskell, Gordon, and Clapp 2002)
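The algorithm above can be sketched in a few lines. The function signatures are illustrative conventions (not the pomp API), and this is the plain bootstrap filter with multinomial resampling at every step.

```python
import numpy as np

def particle_filter(y_star, rprocess, dmeasure, rinit, J=1000, seed=0):
    """Bootstrap particle filter; returns the log likelihood estimate.
    rprocess(X, n, rng): one-step simulator of the latent Markov process;
    dmeasure(y, X): measurement density g(y | x) per particle;
    rinit(J, rng): draws the initial particle swarm from f_0."""
    rng = np.random.default_rng(seed)
    X = rinit(J, rng)                               # filtered swarm at t0
    loglik = 0.0
    for n, y in enumerate(y_star):
        X = rprocess(X, n, rng)                     # prediction particles X^P
        w = dmeasure(y, X)                          # weights w_{n,j}
        loglik += np.log(w.mean())                  # conditional log likelihood
        idx = rng.choice(J, size=J, p=w / w.sum())  # weighted resampling
        X = X[idx]                                  # filtered particles X^F
    return loglik
```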

Particle Markov chain Monte Carlo

  • Given \(\theta_0\), compute \(\ell(\theta_0)\) using the PF
  • For \(m>0\), draw \(\theta^P_m\sim q(\cdot|\theta_{m-1})\), with \(q\) symmetric (otherwise the acceptance ratio must include the proposal density ratio)
  • Compute \(\ell(\theta^P_m)\) using the PF
  • Generate \(U\sim\mathrm{Uniform}(0,1)\)
  • Set \(\theta_m=\begin{cases}\theta^P_m, &\text{if } U<\frac{f_\Theta(\theta^P_m)\exp(\ell(\theta^P_m))}{f_\Theta(\theta_{m-1})\exp(\ell(\theta_{m-1}))},\\ \theta_{m-1},&\text{otherwise.}\end{cases}\)
  • Repeat until the Markov chain converges

Because the particle filter gives an unbiased, though noisy, estimate of the likelihood, this MCMC converges to the correct target distribution.

(Andrieu, Doucet, and Holenstein 2010)
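The chain above is ordinary Metropolis-Hastings with the exact likelihood replaced by a stochastic estimate. A minimal sketch, for a scalar parameter and a symmetric Gaussian proposal; `loglik_hat` can be any estimator whose exponential is unbiased, e.g. a particle filter run.

```python
import numpy as np

def pmcmc(loglik_hat, logprior, theta0, n_iter=4000, step=0.5, seed=0):
    """Particle marginal Metropolis-Hastings (Andrieu et al. 2010), sketch.
    loglik_hat(theta): stochastic log likelihood estimate (e.g., from a
    particle filter); logprior(theta): log prior density."""
    rng = np.random.default_rng(seed)
    theta, ll = theta0, loglik_hat(theta0)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.normal()          # symmetric proposal q
        ll_prop = loglik_hat(prop)                  # re-run the filter
        log_alpha = (ll_prop + logprior(prop)) - (ll + logprior(theta))
        if np.log(rng.uniform()) < log_alpha:       # MH accept/reject
            theta, ll = prop, ll_prop
        chain.append(theta)
    return np.array(chain)
```

Note that the likelihood estimate for the current state is retained, not recomputed; this is essential for the exactness of the method.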

Iterated filtering

  • main problem with particle filter: particle depletion
  • basic idea of iterated filtering:
    • replace fixed parameters by parameters that execute a random walk, e.g., \[\theta \mapsto \theta_n \sim \mathrm{Normal}(\theta_{n-1},\sigma)\]
    • this combats particle depletion by introducing variability
    • also permits local exploration of the likelihood surface
    • take \(\sigma\to 0\) iteratively
    • analogies with simulated annealing

(Ionides, Bretó, and King 2006; Ionides, Nguyen, Atchadé, Stoev, and King 2015; King et al. 2016)

Iterated filtering

  • Choose starting point \(\theta_0\) and random-walk perturbations \(\sigma_{m}\), \(m=0,1,2,\dots\).
  • Initialize particle swarm by drawing \(\theta^F_{j,0} \sim \mathrm{Normal}(\theta_0,\sigma_0)\)
  • For \(m>0\), run PF to obtain \(\{\theta^F_{j,m}\}_{j=1}^J\)
  • Repeat until convergence
  • The \(\theta^F\) converge to the maximum likelihood estimate of \(\theta\)

(Ionides et al. 2015)
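A toy version of the idea: each particle carries its own parameter value, perturbed by a random walk whose intensity is cooled geometrically across filtering passes (cf. IF2 of Ionides et al. 2015). To keep the sketch short there is no latent state here, i.e. a degenerate POMP in which the state is the parameter itself; `filt_step` and the scalar-parameter setup are illustrative assumptions.

```python
import numpy as np

def iterated_filtering(y_star, filt_step, theta0, sigma0=0.5,
                       J=500, M=30, cool=0.9, seed=0):
    """Toy iterated-filtering sketch for a scalar parameter.
    filt_step(theta, y_n, rng): per-particle filtering weights g(y_n | theta).
    Returns the mean of the final swarm as an approximate MLE."""
    rng = np.random.default_rng(seed)
    theta = theta0 + sigma0 * rng.normal(size=J)        # initial swarm
    sigma = sigma0
    for m in range(M):
        sigma *= cool                                   # cooling schedule
        for y in y_star:
            theta = theta + sigma * rng.normal(size=J)  # perturb parameters
            w = filt_step(theta, y, rng)                # filtering weights
            idx = rng.choice(J, size=J, p=w / w.sum())  # resample swarm
            theta = theta[idx]
    return theta.mean()
```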

Relative merits

  • SL throws away information, but gains robustness
  • Many choices are possible for SL probes, with consequences for inference
  • IF converges much faster than PMCMC
  • Frequentist vs Bayesian inference styles
  • Smoothed state trajectories can be computed by PMCMC
  • Probes can be used for post hoc checks of model adequacy

Recommendation: an integrated approach

  • Use IF, profile likelihood, and profile trace plots to locate the maxima of the likelihood surface.
  • Use PMCMC to obtain posterior densities and smoothed state trajectories if desired.
  • Use independent parameter estimates to diagnose model misspecification.
  • Use probes to ascertain model adequacy and diagnose model misspecification.

Exploiting the age structure

  • relative transmissibility 0.99 [0.40–1.00]
  • relative observability 0.39 [0.19–1.00]
  • high detection rates in adolescents and in adults: 24% [10–66%]
  • despite effectiveness, eradication via routine immunization alone is not possible

(Domenech de Cellès et al. 2018)

References

Andrieu C, Doucet A, Holenstein R (2010). “Particle Markov Chain Monte Carlo Methods.” Journal of the Royal Statistical Society, Series B, 72(3), 269–342. https://doi.org/10.1111/j.1467-9868.2009.00736.x.

Arulampalam MS, Maskell S, Gordon N, Clapp T (2002). “A Tutorial on Particle Filters for Online Nonlinear, Non-Gaussian Bayesian Tracking.” IEEE Trans Signal Process, 50, 174–188. https://doi.org/10.1109/78.978374.

Domenech de Cellès M, Magpantay FMG, King AA, Rohani P (2016). “The Pertussis Enigma: Reconciling Epidemiology, Immunology, and Evolution.” Proceedings of the Royal Society of London, Series B.

Domenech de Cellès M, Magpantay FMG, King AA, Rohani P (2018). “The Impact of Past Vaccination Coverage and Immunity on Pertussis Resurgence.” Sci Transl Med, 10(434), eaaj1748. https://doi.org/10.1126/scitranslmed.aaj1748.

Fasiolo M, Pya N, Wood S (2014). “Statistical Inference for Highly Non-Linear Dynamical Models in Ecology and Epidemiology.” http://arxiv.org/abs/1411.4564

He D, Ionides EL, King AA (2010). “Plug-and-Play Inference for Disease Dynamics: Measles in Large and Small Populations as a Case Study.” J R Soc Interface, 7, 271–283. https://doi.org/10.1098/rsif.2009.0151.

Ionides EL, Bretó C, King AA (2006). “Inference for Nonlinear Dynamical Systems.” Proc Natl Acad Sci, 103(49), 18438–18443. https://doi.org/10.1073/pnas.0603181103.

Ionides EL, Nguyen D, Atchadé Y, Stoev S, King AA (2015). “Inference for Dynamic and Latent Variable Models via Iterated, Perturbed Bayes Maps.” Proc Natl Acad Sci, 112(3), 719–724. https://doi.org/10.1073/pnas.1410597112.

King AA, Ionides EL, Pascual M, Bouma MJ (2008). “Inapparent Infections and Cholera Dynamics.” Nature, 454(7206), 877–880. https://doi.org/10.1038/nature07084.

King AA, Nguyen D, Ionides EL (2016). “Statistical Inference for Partially Observed Markov Processes via the R Package Pomp.” J Stat Softw, 69(12), 1–43. https://doi.org/10.18637/jss.v069.i12.

Lavine JS, King AA, Bjørnstad ON (2011). “Natural Immune Boosting in Pertussis Dynamics and the Potential for Long-Term Vaccine Failure.” Proceedings of the National Academy of Sciences of the U.S.A., 108(17), 7259–7264. https://doi.org/10.1073/pnas.1014394108.

Magpantay FMG, Domenech de Cellès M, Rohani P, King AA (2016). “Pertussis Immunity and Epidemiology: Mode and Duration of Vaccine-Induced Immunity.” Parasitology, 143, 835–849. https://doi.org/10.1017/S0031182015000979.

Wood SN (2010). “Statistical Inference for Noisy Nonlinear Ecological Dynamic Systems.” Nature, 466, 1102–1104. https://doi.org/10.1038/nature09319.